Intro for Lochlan

This is just a document to show off some of our data and preliminary results. It’ll give you an idea of what we have, what we need to tinker with, and what we still need to do.

Also, I realize that much of this will probably need me to explain it, or we should go over it together. Writing EVERYTHING out would just take too long. This is more of a visual aid, really.

Runs that definitely need some polishing up:

  • M.bicolor4: clusters weird in PCA
  • Omajor1_B: not really stop-start
  • Ohaddoni5_B: a biiig pause
  • Ohaddoni4_B: last half is a run
  • Ohaddoni2_B: literally one burst
  • Ohaddoni2_A: Hmmmm, kinda irregular but kinda good
  • Mluctuosa6_A: big pause
  • Mluctuosa5_A: needs smoothing/increase in threshold
  • Mluctuosa4_D: sucks
    • 4_C: big pauses
    • 4_B: big pause
    • 4_A: uggggh
  • Mluctuosa1_A & B: a bit of noise
  • Mbicolor4_D: hmm. pause and needs smoothing/increase in threshold
  • Mbicolor1_A: HUGE pause
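
# Reset the interval table; it is read again further down, so it is presumably
# filled as a side effect while stat_generator runs.
INTERVAL.DB <- NULL
# One row of summary statistics per trajectory (run)
stats <- TrajsMergeStats(trajs, stat_generator)
rownames(stats) <- names(trajs) %>% unlist()
# Run names look like "Mluctuosa4_D": species, individual number, run letter
stats$species <- row.names(stats) %>% 
  strapplyc("(.+)[0-9]_[A-Z]$") %>% unlist() %>% as.factor()
stats$individual <- row.names(stats) %>% 
  strapplyc("(.+[0-9])_[A-Z]$") %>% unlist() %>% as.factor()
stats$run <- row.names(stats)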
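
To make the naming convention concrete, here's a small sketch of what those two patterns pull out of a run name. It uses base R (`regexec()`/`regmatches()`) instead of `strapplyc()`, and `parse_run()` is just a made-up helper name, so treat it as an illustration rather than what the pipeline actually runs. Names that don't fit the pattern come back as NA so they're easy to spot.

```r
# Hypothetical helper: split run names like "Mluctuosa4_D" into their parts.
# Group 1 = individual ("Mluctuosa4"), group 2 = species ("Mluctuosa"),
# group 3 = run letter ("D"). Same patterns as above, merged into one regex.
parse_run <- function(run) {
  m <- regmatches(run, regexec("^((.+)[0-9])_([A-Z])$", run))
  pick <- function(i) {
    vapply(m, function(x) if (length(x)) x[i] else NA_character_, character(1))
  }
  data.frame(
    run        = run,
    species    = pick(3),
    individual = pick(2),
    letter     = pick(4),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_run(c("Mluctuosa4_D", "Omajor1_B", "M.bicolor4"))
print(parsed)
```

One thing to keep in mind: because `.+` is greedy and only a single digit is split off, an individual numbered 10 or higher would have its first digit glued onto the species name. That doesn't affect the runs listed above, but it would matter if we add more individuals.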

Here’s a PCA of the runs

This sort of thing is the real clincher; it’ll probably be the most important figure in our paper. It will BE the paper, haha.

autoplot(prcomped,
         data=pca.stated %>% set_rownames(pca.stated$run),
         colour="species", label=T,
         loadings = TRUE, loadings.colour = 'blue',
         loadings.label = TRUE, loadings.label.size = 3, size=5, x=1,y=2)

This is made only using variables about the stopping and starting

autoplot(prcomped,
         data=pca.stated %>% set_rownames(pca.stated$run),
         colour="species", label=T,
         loadings = TRUE, loadings.colour = 'blue',
         loadings.label = TRUE, loadings.label.size = 3, size=5, x=1,y=2)

This was made excluding the standard deviation variables from above.

autoplot(prcomped,
         data=pca.stated %>% set_rownames(pca.stated$run),
         colour="species", label=T,
         loadings = TRUE, loadings.colour = 'blue',
         loadings.label = TRUE, loadings.label.size = 3, size=5, x=1,y=2)

This last one includes EVERYTHING I have, like straightness of the trajectory and the estimated ‘dominant frequencies’ of their speed over time, etc.
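
The three plots differ only in which columns go into the PCA. The objects `prcomped` and `pca.stated` are built elsewhere, so here's a rough sketch of what that step might look like, assuming a plain `prcomp()` on centred and scaled columns. The data are simulated random numbers, `run_pca()` is a made-up helper, and the column names marked as placeholders are guesses for illustration only.

```r
set.seed(42)
n_runs <- 12

# Simulated stand-in for the per-run stats table (random numbers, not our data)
toy_stats <- data.frame(
  species      = factor(rep(c("spA", "spB", "spC"), each = 4)),
  med_movedurs = rexp(n_runs, rate = 2),
  med_stopdurs = rexp(n_runs, rate = 4),   # placeholder column name
  sd_movedurs  = rexp(n_runs, rate = 5),   # placeholder column name
  sd_stopdurs  = rexp(n_runs, rate = 8),   # placeholder column name
  straightness = runif(n_runs),            # placeholder column name
  dfreq1       = runif(n_runs, 0.5, 5)
)

# PCA on a chosen set of columns, dropping runs with missing values first
run_pca <- function(df, vars) {
  complete <- df[complete.cases(df[, vars, drop = FALSE]), , drop = FALSE]
  prcomp(complete[, vars, drop = FALSE], center = TRUE, scale. = TRUE)
}

# The three variable sets: stop-start only, stop-start without SDs, everything
stop_start <- c("med_movedurs", "med_stopdurs", "sd_movedurs", "sd_stopdurs")
no_sd      <- stop_start[!grepl("^sd_", stop_start)]
everything <- c(stop_start, "straightness", "dfreq1")

pcas <- lapply(
  list(stop_start = stop_start, no_sd = no_sd, everything = everything),
  run_pca, df = toy_stats
)

# Share of total variance captured by PC1 in each version
pc1_share <- sapply(pcas, function(p) p$sdev[1]^2 / sum(p$sdev^2))
print(round(pc1_share, 2))
```

Scaling matters here because durations, speeds and frequencies are on very different scales; without `scale. = TRUE` the variable with the biggest raw variance would dominate the first component.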

ggplot(stats %>% filter(species!="Goldcampo") %>% filter(species!="Smallgoldpoly"),
       aes(y=med_movespeed, x=med_movedurs, color=species)) +
  geom_point(size=6)
## Warning: Removed 1 rows containing missing values (geom_point).

Which variables matter?

Okay, now let’s look at which variables they actually differ in.

  • “med_” means “median value of _” and “durs” means durations. “freqs” refers to the combined stop-start period.
  • “dfreq[123]” refers to the “dominant frequencies” of the speed time series, i.e. the frequencies with the strongest peaks in its spectrum. It’s some confusing spectral analysis stuff, but it should theoretically describe how regularly the stopping and starting repeats. I’ve included the first three dominant frequencies.
  • “nobu” (from my labmate Nobuaki) represents the time length of highest rhythmicity, so it’s kinda similar.
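
To make “dominant frequency” a bit less mysterious, here's a minimal sketch that finds the strongest peaks in a periodogram using base R's `fft()`. This is not necessarily how the `dfreq` columns were computed (GeneCycle is loaded in the session, so the real values may well come from there); `dominant_freqs()` and the simulated speed trace are made up for illustration.

```r
# Hypothetical helper: return the m frequencies with the most power in the
# periodogram of x. dt is the time between samples (assumed constant).
dominant_freqs <- function(x, m = 3, dt = 1) {
  n <- length(x)
  power <- Mod(fft(x - mean(x)))^2 / n     # raw periodogram after de-meaning
  k <- seq_len(floor(n / 2))               # positive frequencies only, skip 0
  freq <- k / (n * dt)
  pw <- power[k + 1]
  top <- order(pw, decreasing = TRUE)[seq_len(m)]
  data.frame(freq = freq[top], power = pw[top])
}

# Simulated "speed" trace: a strong rhythm at 0.10 cycles per sample plus a
# weaker one at 0.25 cycles per sample, riding on a constant baseline
t <- 0:199
speed <- 1 + sin(2 * pi * 0.10 * t) + 0.5 * sin(2 * pi * 0.25 * t)

dom <- dominant_freqs(speed, m = 2)
print(dom)
```

On this toy trace the two strongest peaks sit at 0.10 and 0.25 cycles per sample, in that order. For a real run, a clean stop-start rhythm should show up as one tall peak, while an irregular run should give a flatter spectrum where the “dominant” frequencies mean a lot less.
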
stats2 <- stats %>%
  filter(species!="Goldcampo") %>% filter(species!="Smallgoldpoly") %>% 
  mutate(species=factor(species))
for (i in colnames(stats2)) {
    try(plot(stats2[[i]] ~ stats2[['species']], 
             main=paste0("Comparing '", i, "' across strobing species"),
             ylab=i, xlab="species" ))
}
## Error in plot.window(...) : need finite 'ylim' values
stats %>% 
  filter(species!="Goldcampo") %>% filter(species!="Smallgoldpoly") %>% 
  plot(data=., med_movedurs ~ species)
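
The `need finite 'ylim' values` error above typically means a column had no finite numbers to plot, e.g. it was all NA or not numeric (like `run`). One way around it is to loop only over columns that can actually be plotted. Here's a sketch with a made-up helper, `plottable_vars()`, and a tiny invented data frame; the values are not from our runs.

```r
# Hypothetical helper: names of numeric columns with at least one finite value,
# excluding the grouping column
plottable_vars <- function(df, group = "species") {
  ok <- vapply(df, function(col) is.numeric(col) && any(is.finite(col)),
               logical(1))
  setdiff(names(df)[ok], group)
}

# Invented example table: one usable column, one text column, one all-NA column
toy <- data.frame(
  species      = factor(rep(c("spA", "spB"), each = 3)),
  run          = paste0("run", 1:6),
  med_movedurs = c(0.40, 0.50, 0.45, 0.90, 1.10, 1.00),
  dfreq3       = NA_real_,
  stringsAsFactors = FALSE
)

vars <- plottable_vars(toy)

# pdf(NULL) keeps this sketch from writing a plot file; drop it (and dev.off())
# when running interactively
pdf(NULL)
for (v in vars) {
  boxplot(toy[[v]] ~ toy$species,
          main = paste0("Comparing '", v, "' across species"),
          xlab = "species", ylab = v)
}
invisible(dev.off())
```

This also makes it obvious which variables got skipped, which is handy for tracking down why a stat came out as all NA.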

INTERVAL.DB$species <- 
  INTERVAL.DB$filename %>% strapplyc("(.+)[0-9]_[A-Z]$") %>% unlist() %>% as.factor()

stops <- subset(INTERVAL.DB, source == "intervals" )
moves <- subset(INTERVAL.DB, source == "REVERSEintervals" )
stopsgood <- stops %>%
  group_by(species) %>%
  mutate(duration = remove_outliers(duration)) %>%
  na.omit() %>%
  ungroup()
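
`remove_outliers()` is defined elsewhere. Since it's followed by `na.omit()`, it presumably turns outliers into NA rather than dropping them itself. Here's a sketch of one common way to do that, assuming the usual 1.5 × IQR rule; the real function may use a different cutoff. The example applies it per species with base R's `ave()` on invented numbers.

```r
# Assumed behaviour: values beyond k * IQR from the quartiles become NA
remove_outliers <- function(x, k = 1.5) {
  q <- unname(quantile(x, probs = c(0.25, 0.75), na.rm = TRUE))
  fence <- k * (q[2] - q[1])
  x[x < q[1] - fence | x > q[2] + fence] <- NA
  x
}

# Invented durations: species "spA" has one huge pause, "spB" has none
toy <- data.frame(
  species  = rep(c("spA", "spB"), each = 6),
  duration = c(1, 2, 3, 4, 5, 100, 10, 11, 12, 13, 14, 15)
)

# Flag outliers within each species, then drop the flagged rows
toy$duration <- ave(toy$duration, toy$species, FUN = remove_outliers)
toy_good <- na.omit(toy)
print(toy_good)
```

Doing this per species matters: a pause that is normal for a slow species shouldn't get thrown out just because it looks long next to a faster one.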


# INTERVAL.DB %>% 
#   #filter(duration>0.03) %>% 
#   group_by(species) %>% mutate(duration =  remove_outliers(duration)) %>% na.omit() %>% ungroup() %>% filter(species!="Goldcampo") %>% filter(species!="Smallgoldpoly")  %>% mutate(species = species %>% as.numeric() %>% as.factor())->hey
# 
# 
# 
# ggplot(subset(hey, source=="REVERSEintervals"), aes(x=species, y=duration, fill=filename)) + 
#   geom_boxplot() 



#trajs[[20]] %>% TrajDerivatives() %>% {cbind(.$speed,c(0,.$acceleration))} %>% as.data.frame() %>%  segclust(seg.var=c("V1","V2"), lmin=5, Kmax=50, scale.variable=F, ncluster = c(2)) %>% plot()

Other notes

Also, I think we found (as other people have) that when M.luctuosa pauses, they wave their (“extra”) front legs around like antennae to mimic ants better. However, I think M.bicolor doesn’t do that: they keep their front legs still. Excitingly, when the Opisthopsis stop, unlike other ants, they don’t move their antennae at all. Mimicry!

Session info

sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17763)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] segclust2d_0.2.0    ggfortify_0.4.7     gsubfn_0.7         
##  [4] proto_1.0.0         magrittr_1.5        forcats_0.4.0      
##  [7] stringr_1.4.0       dplyr_0.8.3         purrr_0.3.2        
## [10] tidyr_0.8.3         tibble_2.1.3        ggplot2_3.2.0      
## [13] tidyverse_1.2.1     readr_1.3.1         GeneCycle_1.1.4    
## [16] fdrtool_1.2.15      longitudinal_1.1.12 corpcor_1.6.9      
## [19] MASS_7.3-51.4       trajr_1.3.0        
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.1       lubridate_1.7.4  lattice_0.20-38  gtools_3.8.1    
##  [5] assertthat_0.2.1 digest_0.6.20    R6_2.4.0         cellranger_1.1.0
##  [9] backports_1.1.4  signal_0.7-6     evaluate_0.14    httr_1.4.0      
## [13] highr_0.8        pillar_1.4.2     rlang_0.4.0      lazyeval_0.2.2  
## [17] readxl_1.3.1     rstudioapi_0.10  gdata_2.18.0     rmarkdown_1.13  
## [21] labeling_0.3     munsell_0.5.0    broom_0.5.2      compiler_3.6.1  
## [25] modelr_0.1.4     xfun_0.8         pkgconfig_2.0.2  htmltools_0.3.6 
## [29] tcltk_3.6.1      tidyselect_0.2.5 gridExtra_2.3    crayon_1.3.4    
## [33] withr_2.1.2      grid_3.6.1       nlme_3.1-140     jsonlite_1.6    
## [37] gtable_0.3.0     scales_1.0.0     cli_1.1.0        stringi_1.4.3   
## [41] xml2_1.2.0       generics_0.0.2   tools_3.6.1      glue_1.3.1      
## [45] hms_0.4.2        yaml_2.2.0       colorspace_1.4-1 rvest_0.3.4     
## [49] knitr_1.23       haven_2.1.1
 

A work by Andrew T Burchill

aburchil@asu.edu